Compute the probability of observing this dataset given a proposed set of parameters.
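y <- c(4, 0, 1)
likelihood <- function(lambda) {
  # Probability of each observation under a Poisson model with rate lambda
  probs <- dpois(y, lambda)
  # Multiply the individual probabilities to get the probability of the whole dataset
  prod(probs)
}
likelihood(1)
## [1] 0.002074461
likelihood(0.5)
## [1] 0.0002905341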
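Written out for this dataset, the product of Poisson probabilities works out as follows (the symbol \(L\) is introduced here only for illustration; the exponent 5 is the sum of the three observations):

\[ L(\lambda) = \prod_{i=1}^{3} \frac{\lambda^{y_i} e^{-\lambda}}{y_i!} = \frac{\lambda^{5} e^{-3\lambda}}{4! \times 0! \times 1!} = \frac{\lambda^{5} e^{-3\lambda}}{24} \]

At \(\lambda = 1\) this gives \(e^{-3} / 24 \approx 0.002074\), which matches the first value above.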
Keep trying new parameters until you find the set that makes the data most probable.
objective <- function(lambda) {
  # optim() minimises by default, so negate the likelihood to maximise it
  -1 * likelihood(lambda)
}
optim(c(lambda = 0), objective)$par
##   lambda
## 1.666667
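As a cross-check, the maximum likelihood estimate of a Poisson rate has a closed form: the sample mean, here 5/3. The sketch below compares that with a one-dimensional search using `optimize()`, which base R recommends over Nelder-Mead for single-parameter problems. The search interval `c(0, 10)` is an assumption made for this example.

```r
# Same data and likelihood as above, repeated so this block runs on its own
y <- c(4, 0, 1)
likelihood <- function(lambda) {
  prod(dpois(y, lambda))
}

# One-dimensional search for the maximum over an assumed interval (0, 10)
fit <- optimize(likelihood, interval = c(0, 10), maximum = TRUE)
fit$maximum

# Closed-form maximum likelihood estimate for a Poisson rate
mean(y)
```

The two numbers should agree to within the optimiser's tolerance.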
Compute the likelihood at a range of possible values.
lambda <- seq(0, 10, length.out = 100)
# Store the results under a new name so the likelihood() function is not overwritten
likelihoods <- sapply(lambda, likelihood)
Considers the probability distribution over data, given parameters, but then treats this surface somewhat like a probability distribution over parameters.
The usual confidence intervals assume the likelihood surface is shaped like a normal distribution (it isn’t).
“Were this procedure to be repeated on numerous samples, the fraction of calculated confidence intervals (which would differ for each sample) that encompass the true population parameter would tend toward 90%.”
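The repeated-sampling idea in that quote can be checked by simulation. The sketch below is illustrative only: it assumes a hypothetical true rate of 1.5, reuses the sample size of 3 from the example dataset, and builds each 90% interval from the normal approximation (estimate plus or minus `qnorm(0.95)` standard errors).

```r
set.seed(1)          # arbitrary seed, for reproducibility only
true_lambda <- 1.5   # hypothetical "true" parameter
n <- 3               # same sample size as the example dataset
n_sims <- 10000

covered <- replicate(n_sims, {
  y_sim <- rpois(n, true_lambda)
  estimate <- mean(y_sim)
  std_error <- sqrt(estimate / n)        # normal-approximation standard error
  half_width <- qnorm(0.95) * std_error  # 90% interval: 5% in each tail
  (estimate - half_width <= true_lambda) && (true_lambda <= estimate + half_width)
})

mean(covered)  # fraction of intervals that contain true_lambda
```

With so few observations the normal approximation is rough, so the observed fraction can differ from the nominal 90%.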
Considers the probability distribution over parameters, given data.
\[ p(\text{parameters} | \text{data}) = \frac{p(\text{data} | \text{parameters}) \times p(\text{parameters})}{p(\text{data})} \]
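For a single parameter, each term in this formula can be approximated on a grid. The sketch below reuses the Poisson example; the Exponential(1) prior and the grid limits are assumptions chosen only for illustration.

```r
y <- c(4, 0, 1)

# Grid of candidate parameter values (the upper limit of 10 is an assumption)
lambda <- seq(0.01, 10, length.out = 1000)
step <- lambda[2] - lambda[1]

# p(data | parameters): the likelihood at each grid point
lik <- sapply(lambda, function(l) prod(dpois(y, l)))

# p(parameters): a hypothetical Exponential(1) prior
prior <- dexp(lambda, rate = 1)

# p(data): approximated by summing the numerator over the grid
evidence <- sum(lik * prior) * step

posterior <- lik * prior / evidence
sum(posterior * step)                             # integrates to 1 by construction
posterior_mean <- sum(lambda * posterior * step)  # approximate posterior mean
posterior_mean
```

This brute-force approach only works when there are very few parameters, because the number of grid points grows exponentially with the number of dimensions.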
\[ p(\text{parameters} | \text{data}) = \frac{p(\text{data} | \text{parameters}) \times p(\text{parameters})}{☹} \]
go to the whiteboard!
Because of the \(☹\), it’s a bit tricky to estimate the parameters of the posterior.
Instead, we can draw random samples from the posterior. With enough samples, we can estimate the parameters.
Unfortunately, we don’t have a random number generator for every model.
Markov chain Monte Carlo gives us a time series of correlated random numbers from our distribution.
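A minimal random-walk Metropolis sampler shows the idea. It is a sketch under stated assumptions, not a production sampler: the Exponential(1) prior, the proposal standard deviation of 1, the starting value, and the number of iterations are all arbitrary choices for illustration. Only the numerator of Bayes' rule is needed, because the troublesome denominator cancels in the acceptance ratio.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
y <- c(4, 0, 1)

# Log of the unnormalised posterior: log-likelihood plus log-prior.
# The Exponential(1) prior is a hypothetical choice for illustration.
log_unnormalised_posterior <- function(lambda) {
  if (lambda <= 0) return(-Inf)  # the Poisson rate must be positive
  sum(dpois(y, lambda, log = TRUE)) + dexp(lambda, rate = 1, log = TRUE)
}

n_iter <- 20000
samples <- numeric(n_iter)
current <- 1  # arbitrary starting value

for (i in seq_len(n_iter)) {
  proposal <- current + rnorm(1, mean = 0, sd = 1)  # symmetric random-walk proposal
  log_ratio <- log_unnormalised_posterior(proposal) -
    log_unnormalised_posterior(current)
  if (log(runif(1)) < log_ratio) {
    current <- proposal  # accept; otherwise stay at the current value
  }
  samples[i] <- current
}

mean(samples)  # estimate of the posterior mean
```

With this particular prior the exact posterior is a Gamma distribution with shape 6 and rate 4, whose mean is 1.5, so the sample mean can be checked against a known answer.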
One approach to dealing with multiple candidate models is to average them, weighting each by how good it is.
Bayesian inference does this automatically!
We get lots of different models, weighted according to how probable they are.